!git clone https://github.com/Plachtaa/seed-vc.git
Seed voice conversion
Zero-shot voice conversion and zero-shot singing voice conversion. GitHub: https://github.com/Plachtaa/seed-vc
dependencies
%cd seed-vc
!pip install -r requirements.txt
!pip install --upgrade protobuf===5.29.3
# run if kernel restarted
%cd /content/seed-vc
singing voice conversion
download songs
!mkdir input
!wget -O input/japanese.wav https://plachtaa.github.io/seed-vc/demos/references/teio_0.wav
!wget -O input/seeyouagain.wav https://huggingface.co/spaces/Plachta/Seed-VC/resolve/main/examples/source/Wiz%20Khalifa%2CCharlie%20Puth%20-%20See%20You%20Again%20%5Bvocals%5D_%5Bcut_28sec%5D.wav
!wget -O input/trump.wav https://plachtaa.github.io/seed-vc/demos/references/trump_0.wav
!wget -O input/ref_song.wav https://plachtaa.github.io/seed-vc/demos/sources/%E4%B8%96%E7%95%8C%E8%BF%98%E5%B0%8F.wav
!wget -O input/dingzhen.wav https://plachtaa.github.io/seed-vc/demos/references/dingzhen_0.wav
run conversion
# use webui
!python app_svc.py --fp16 True --share True
# or use command line
# --diffusion-steps 100 for best quality
# --f0-condition True for SVC
!python inference.py --source /content/seed-vc/input/ref_song.wav --target /content/seed-vc/input/japanese.wav --output output \
--diffusion-steps 100 \
--length-adjust 1.0 \
--inference-cfg-rate 0.7 \
--f0-condition True \
--auto-f0-adjust False \
--semi-tone-shift 0 \
--fp16 True
2025-04-23 14:36:13.564924: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
E0000 00:00:1745418973.581998 15926 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1745418973.587121 15926 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2025-04-23 14:36:13.609608: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
Warning: Skipped loading some keys due to shape mismatch: {'estimator.conv2.bias', 'estimator.res_projection.bias', 'estimator.input_pos', 'estimator.t_embedder2.mlp.2.bias', 'estimator.conv2.weight', 'estimator.t_embedder2.mlp.0.weight', 'estimator.t_embedder2.mlp.2.weight', 'estimator.conv1.weight', 'estimator.conv1.bias', 'estimator.res_projection.weight', 'estimator.t_embedder2.mlp.0.bias'}
cfm loaded
length_regulator loaded
Loading weights from nvidia/bigvgan_v2_44khz_128band_512x
Removing weight norm...
It is strongly recommended to pass the `sampling_rate` argument to this function. Failing to do so can result in silent errors that might be hard to debug.
It is strongly recommended to pass the `sampling_rate` argument to this function. Failing to do so can result in silent errors that might be hard to debug.
100% 100/100 [00:24<00:00, 4.04it/s]
100% 100/100 [00:08<00:00, 11.40it/s]
RTF: 1.4874707420275481
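The last log line reports an RTF value. Assuming this is the usual real-time factor (processing time divided by the duration of the generated audio), a value of about 1.49 would mean the conversion took roughly one and a half times as long as the audio it produced. A minimal sketch of that arithmetic follows; the function name and the timings are hypothetical and are not taken from the seed-vc code.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Return processing time divided by audio duration.

    Assumption: this is the conventional definition of RTF; the notebook
    output does not state how seed-vc computes the number it prints.
    """
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


# Hypothetical timings: 30 s of compute for a 20 s clip.
rtf = real_time_factor(30.0, 20.0)
print(f"RTF: {rtf}")
print("slower than real time" if rtf > 1.0 else "real time or faster")
```

Under this definition, values below 1.0 mean the audio is generated faster than it plays back.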
display results
# example 1
import os
from IPython.display import Audio
print("target song\n")
display(Audio("/content/seed-vc/input/seeyouagain.wav"))
print("reference voice\n")
display(Audio("/content/seed-vc/input/trump.wav"))
print("converted voice\n")
display(Audio("/content/seed-vc/output/vc_seeyouagain_trump_1.0_100_0.7.wav"))
target song
reference voice
converted voice
# example 2
print("target song\n")
display(Audio("/content/seed-vc/input/ref_song.wav"))
print("reference voice\n")
display(Audio("/content/seed-vc/input/dingzhen.wav"))
print("converted voice\n")
display(Audio("/content/seed-vc/output/vc_ref_song_dingzhen_1.0_100_0.7.wav"))
target song
reference voice
converted voice
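The output files loaded in both examples appear to follow the pattern vc_<source>_<target>_<length-adjust>_<diffusion-steps>_<inference-cfg-rate>.wav. That pattern is inferred only from the two filenames used in this notebook, not from the seed-vc source, so treat the helper below (a hypothetical name) as a sketch for locating a result after a run with different inputs or settings.

```python
from pathlib import PurePosixPath


def expected_output_path(source, target, output_dir="output",
                         length_adjust=1.0, diffusion_steps=100, cfg_rate=0.7):
    """Guess the converted file's path from the inputs and settings.

    Assumption: the naming scheme is inferred from the filenames shown in
    this notebook and may not hold for other seed-vc versions.
    """
    source_stem = PurePosixPath(source).stem  # e.g. "seeyouagain"
    target_stem = PurePosixPath(target).stem  # e.g. "trump"
    name = (f"vc_{source_stem}_{target_stem}_"
            f"{length_adjust}_{diffusion_steps}_{cfg_rate}.wav")
    return PurePosixPath(output_dir) / name


path = expected_output_path(
    "/content/seed-vc/input/seeyouagain.wav",
    "/content/seed-vc/input/trump.wav",
    output_dir="/content/seed-vc/output",
)
print(path)
```

The returned path can be passed to `Audio(str(path))` in place of a hard-coded filename.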
# used inputs
#input_list = os.listdir("/content/seed-vc/input")
#for i in input_list:
#    print(i)
#    display(Audio("/content/seed-vc/input/"+i))
# generated output
#output_list = os.listdir("/content/seed-vc/output")
#for i in output_list:
#    print(i)
#    display(Audio("/content/seed-vc/output/"+i))
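To convert several song/voice pairs in one go, the command-line call from the run conversion section could be assembled in Python and handed to `subprocess`. The sketch below only builds and prints the commands; the flags are the ones used in this notebook, while `build_inference_command` and the list of pairs are hypothetical. Uncomment the `subprocess.run` line to execute from the seed-vc directory.

```python
import shlex
import subprocess  # only needed if the run line below is uncommented


def build_inference_command(source, target, output_dir="output",
                            diffusion_steps=100, length_adjust=1.0,
                            cfg_rate=0.7, semi_tone_shift=0):
    """Return the inference.py call from this notebook as an argument list."""
    return [
        "python", "inference.py",
        "--source", source,
        "--target", target,
        "--output", output_dir,
        "--diffusion-steps", str(diffusion_steps),
        "--length-adjust", str(length_adjust),
        "--inference-cfg-rate", str(cfg_rate),
        "--f0-condition", "True",        # True for singing voice conversion
        "--auto-f0-adjust", "False",
        "--semi-tone-shift", str(semi_tone_shift),
        "--fp16", "True",
    ]


# Hypothetical pairing of the files downloaded earlier: (song, reference voice).
pairs = [
    ("/content/seed-vc/input/seeyouagain.wav", "/content/seed-vc/input/trump.wav"),
    ("/content/seed-vc/input/ref_song.wav", "/content/seed-vc/input/dingzhen.wav"),
]

commands = [build_inference_command(src, tgt) for src, tgt in pairs]
for cmd in commands:
    print(shlex.join(cmd))
    # subprocess.run(cmd, check=True)  # uncomment to run the conversion
```

Passing the arguments as a list avoids shell quoting problems if a filename contains spaces.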